Conversation
Force-pushed 74b2ed4 to b8a9132
@fs-eire, thank you for adding these utilities. We currently have some plugin EP utilities in this directory: https://github.com/microsoft/onnxruntime/tree/main/include/onnxruntime/core/providers/utils And there's another on the way here: #25753 Could we consolidate all of these utilities into one location?
include/onnxruntime/ep/README.md (Outdated)

    ### Usage

    Make sure to include "ep/_pch.h" for all source code in the implementation. Using PCH is recommended.
Using PCH is recommended.
does this mean something other than including "_pch.h"? it's not obvious to me.
perhaps I'm just not understanding the name. could you either explain it more here or give _pch.h a more descriptive name?
_pch.h is the file, but we actually want to use compiler flags to specify the file as a PCH, to enforce that it is included first.
I probably should write "Using the PCH compiler flag is recommended."
updated comments to make it clear.
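For reference, a sketch of how the PCH compiler flags mentioned above might be wired up. The target name and path below are illustrative, not taken from the actual ORT build files:

```cmake
# Illustrative only: make the compiler guarantee that "ep/_pch.h" is seen
# before any other header in every translation unit of the target.

# CMake 3.16+: generate and use a real precompiled header for the target.
target_precompile_headers(my_plugin_ep PRIVATE
  "${CMAKE_CURRENT_SOURCE_DIR}/include/onnxruntime/ep/_pch.h")

# Equivalent "force include" flags, if PCH generation is not desired:
#   MSVC:      /FI ep/_pch.h
#   GCC/Clang: -include ep/_pch.h
```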
    #pragma once

    #include "core/framework/allocator.h"
naive question - is this header intended to be used by a plugin EP which doesn't have access to internal ORT code? if so, how can we use internal ORT headers?
No. The folder include/onnxruntime/ep/adapter is designed for a very specific goal: to allow the existing EPs (mainly the WebGPU EP and the CUDA EP) to use the EP API with minimal code changes.
This means most of the existing code can be kept as-is. For some components, we need a "reverse bridge": an implementation of OrtAllocator that wraps an ORT internal allocator, like this.
it may be worth considering whether the utility is generally useful to any plugin EP vs. useful to plugin EPs within the ORT repo. some of these seem like the latter.
I agree. We should not have separate implementations in different places of the repo. In general, I think there are 2 different types of headers:
I would recommend to put them somewhere under All other details are open to discussion.
    auto* allocator = static_cast<const Allocator*>(this_ptr);
    return allocator->memory_info_;
    }
Just want to double check that AllocOnStream is not also needed for cuda or webgpu EPs.
We can always extend those classes to support more functions that may be missing for CUDA. The classes are designed to be generic enough to be reused in a future migration of the CUDA EP.
    inline const DataTransferManager& GetDataTransferManager() const noexcept {
      return data_transfer_manager_;
    }
    [[nodiscard]] Status GetTempSpaceCPUAllocator(AllocatorPtr* output) const {
GetTempSpaceCPUAllocator and GetTempSpaceAllocator are internally implemented by onnxruntime::OpKernelContext. Would it be possible/appropriate to add public C APIs to OrtKernelContext for these functions instead of including them here?
I would prefer to keep it the current way.
There are 2 reasons:

- Performance consideration. Here, onnxruntime::ep::adapter::EP::GetTempSpaceCPUAllocator accepts an AllocatorPtr*, which is a pointer to an internal type. We have access to the reference to the WebGpuExecutionProvider instance anyway, so getting the actual allocator pointer returned by it is not a problem. If we change that to use an allocator returned by the C API, then we have to use onnxruntime::ep::adapter::Allocator to wrap it (which has some overhead).
- For usage inside include/onnxruntime/ep/adapter, there is no point in doing this as long as the WebGPU EP both creates and uses the allocator without the C API.
    return ToOrtStatus(status);
    }
    *out = nullptr;
    status = kernel->CreateControlFlowKernelImpl(info, out);
nit: maybe add a comment explaining why we try to create a control flow kernel first.
added, please review the comments.
Force-pushed b8a9132 to c9e4973
    ### Usage

    Make sure to include "ep/_pch.h" for all source code in the implementation. Using PCH compiler flag is recommended.
I still don't understand the name _pch.h. maybe you can explain it to me. or could we name it something like ep/adapters.h?
We are using EP_SPECIFIC_USING_DECLARATIONS to "override" some ORT classes with our adapter classes. This requires the header file to be included before any header that defines the old ORT classes. The best way to achieve this is to use a PCH (pre-compiled header), because the compiler guarantees that the PCH file is included first.
We do have some 'pch.h' in the winml folder, a 'test_pch.h' in the cmake folder, and a PCH for the CUDA EP: onnxruntime\core\providers\cuda\cuda_pch.h. Using a filename containing 'pch' may be a good way of indicating that the file is explicitly used as a PCH.
The reason why the file is not inside the adapter folder is that it depends on the shared plugin EP headers. Until we refactor them, we probably need to keep it where it is, because otherwise we'd need something like #include "../common.h", which does not seem like a very good approach.
    @@ -0,0 +1,7 @@
    ## EP adapter

    This folder contains a set of C++ header files. They are used specifically for allowing ONNX Runtime internal kernel-based EPs to use the plugin-style EP API while keep minimal changes to existing code.
this readme is not located in the adapters directory. maybe move it there, or explicitly name the adapters directory.
I added a section in the README to explain the folder structure. Since there is an ongoing discussion about unifying the shared headers, I expect the current folder structure to be refactored later.
in future, when we have a dedicated folder for the shared plugin EP headers, we can put everything in the adapter directory.
Description
This PR adds a few headers to support building the WebGPU EP and the CUDA EP as plugin EPs.
See summary of #26907